Query-Based Keyphrase Extraction from Long Documents
Transformer-based architectures in natural language processing impose input
size limits that can be problematic when long documents need to be processed.
This paper overcomes this issue for keyphrase extraction by chunking the long
documents while keeping a global context as a query defining the topic for
which relevant keyphrases should be extracted. The developed system employs a
pre-trained BERT model and adapts it to estimate the probability that a given
text span forms a keyphrase. We experimented using various context sizes on two
popular datasets, Inspec and SemEval, and a large novel dataset. The presented
results show that a shorter context with a query outperforms a longer one without
the query on long documents.
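The chunking idea described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `chunk_with_query`, the `overlap` parameter, the whitespace-level tokens, and the `[CLS] query [SEP] chunk [SEP]` layout are all assumptions made for the example. The abstract does not say how the query is built or how chunks are cut.

```python
from typing import List


def chunk_with_query(
    doc_tokens: List[str],
    query_tokens: List[str],
    max_len: int = 512,
    overlap: int = 0,
) -> List[List[str]]:
    """Split a long document into model-sized inputs that each carry the query.

    Hypothetical layout (an assumption, not taken from the paper):
        [CLS] query [SEP] chunk [SEP]
    """
    # Tokens left for the document after the query and three special tokens.
    budget = max_len - len(query_tokens) - 3
    if budget <= 0:
        raise ValueError("query is too long for the given max_len")
    if not 0 <= overlap < budget:
        raise ValueError("overlap must be in [0, budget)")

    step = budget - overlap
    inputs: List[List[str]] = []
    for start in range(0, len(doc_tokens), step):
        chunk = doc_tokens[start:start + budget]
        inputs.append(["[CLS]"] + query_tokens + ["[SEP]"] + chunk + ["[SEP]"])
        # Stop once this chunk already reaches the end of the document.
        if start + budget >= len(doc_tokens):
            break
    return inputs
```

Each returned sequence stays within `max_len`, so every chunk sees the same global topic even though it only contains a local slice of the document. A span classifier would then score candidate spans per chunk.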
Claim-Dissector: An Interpretable Fact-Checking System with Joint Re-ranking and Veracity Prediction
We present Claim-Dissector: a novel latent variable model for fact-checking
and analysis, which given a claim and a set of retrieved evidences jointly
learns to identify: (i) the relevant evidences to the given claim, (ii) the
veracity of the claim. We propose to disentangle the per-evidence relevance
probability and its contribution to the final veracity probability in an
interpretable way -- the final veracity probability is proportional to a linear
ensemble of per-evidence relevance probabilities. In this way, the individual
contributions of evidences towards the final predicted probability can be
identified. In per-evidence relevance probability, our model can further
distinguish whether each relevant evidence is supporting (S) or refuting (R)
the claim. This makes it possible to quantify how much the S/R probability contributes to
the final verdict or to detect disagreeing evidence.
Despite its interpretable nature, our system achieves results competitive
with state-of-the-art on the FEVER dataset, as compared to typical two-stage
system pipelines, while using significantly fewer parameters. It also sets new
state-of-the-art on FAVIQ and RealFC datasets. Furthermore, our analysis shows
that our model can learn fine-grained relevance cues while using coarse-grained
supervision, and we demonstrate it in two ways. (i) We show that our model can
achieve competitive sentence recall while using only paragraph-level relevance
supervision. (ii) Traversing towards the finest granularity of relevance, we
show that our model is capable of identifying relevance at the token level. To
do this, we present a new benchmark TLR-FEVER focusing on token-level
interpretability -- humans annotate tokens in relevant evidences they
considered essential when making their judgment. Then we measure how similar
these annotations are to the tokens our model is focusing on.
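The "linear ensemble" idea in the Claim-Dissector abstract can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions, not the model from the paper: it uses two classes (S and R), uniform weights, and made-up per-evidence scores, and the function name `verdict_from_evidence` is hypothetical.

```python
from typing import Dict, List, Tuple


def verdict_from_evidence(
    p_support: List[float],
    p_refute: List[float],
) -> Tuple[Dict[str, float], List[Dict[str, float]]]:
    """Combine per-evidence S/R relevance scores into a verdict.

    The verdict probability of each class is proportional to the plain sum
    of the per-evidence scores for that class (uniform weights are an
    assumption of this sketch).
    """
    if len(p_support) != len(p_refute):
        raise ValueError("need one S score and one R score per evidence")

    total = sum(p_support) + sum(p_refute)
    if total <= 0:
        raise ValueError("at least one score must be positive")

    verdict = {"S": sum(p_support) / total, "R": sum(p_refute) / total}
    # Because the combination is linear, each evidence's share of the
    # verdict can be read off directly.
    contributions = [
        {"S": ps / total, "R": pr / total}
        for ps, pr in zip(p_support, p_refute)
    ]
    return verdict, contributions
```

Summing the `"S"` entries of `contributions` reproduces `verdict["S"]`, which is what makes the per-evidence contributions directly readable. Evidence with a high `"R"` share next to evidence with a high `"S"` share would signal disagreement.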
Towards a framework for comparing automatic term recognition methods
Automatic Term Recognition (ATR) focuses on the extraction of words and multi-word expressions that are significant for a given domain. There is considerable interest in using ATR for automatic metadata generation, creation of thesauri and terminological glossaries, keyword extraction, ontology building, etc. In this paper, we build upon the work done at the University of Sheffield, where a library with a few algorithms for ATR was recently developed. We enrich this library with new ATR algorithms and tools for evaluation. Our aim is to perform an experimental study comparing the base ATR methods as well as their combinations under various conditions. The results of the study indicate that better precision can usually be reached by combining ATR methods that use foreground knowledge with ATR methods that use background knowledge. The created platform is freely available and prepared for extensions by other researchers.
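One simple way to combine two term rankings and compare precision is sketched below. The abstract does not state which combination scheme the study uses, so mean-rank fusion is purely an assumption here. The function names `rank_terms`, `mean_rank_combine`, and `precision_at_k` are hypothetical and do not come from the described library.

```python
from typing import Dict, List, Set


def rank_terms(scores: Dict[str, float]) -> List[str]:
    """Order candidate terms by descending score (ties broken by term)."""
    return sorted(scores, key=lambda t: (-scores[t], t))


def mean_rank_combine(score_lists: List[Dict[str, float]]) -> List[str]:
    """Fuse several ATR methods by averaging each term's rank.

    Assumes every method scored the same set of candidate terms.
    """
    terms = set(score_lists[0])
    if any(set(s) != terms for s in score_lists):
        raise ValueError("all methods must score the same candidate terms")

    rank_sum = {t: 0 for t in terms}
    for scores in score_lists:
        for position, term in enumerate(rank_terms(scores), start=1):
            rank_sum[term] += position

    n = len(score_lists)
    return sorted(terms, key=lambda t: (rank_sum[t] / n, t))


def precision_at_k(ranked: List[str], gold: Set[str], k: int) -> float:
    """Fraction of the top-k ranked candidates that are gold-standard terms."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(term in gold for term in ranked[:k]) / k
```

To compare a single method with a combination, call `precision_at_k(rank_terms(foreground_scores), gold, k)` and `precision_at_k(mean_rank_combine([foreground_scores, background_scores]), gold, k)` on the same candidate set, where the score dictionaries and `gold` are supplied by the user.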
Populations of Stored Product Mite Tyrophagus putrescentiae Differ in Their Bacterial Communities
Citation: Erban, T., Klimov, P. B., Smrz, J., Phillips, T. W., Nesvorna, M., Kopecky, J., & Hubert, J. (2016). Populations of Stored Product Mite Tyrophagus putrescentiae Differ in Their Bacterial Communities. Frontiers in Microbiology, 7, 19. doi:10.3389/fmicb.2015.01046
Background: Tyrophagus putrescentiae colonizes different human-related habitats and feeds on various postharvest foods. The microbiota acquired by these mites can influence the nutritional plasticity in different populations. We compared the bacterial communities of five populations of T. putrescentiae and one mixed population of T. putrescentiae and T. fanetzhangorum collected from different habitats.
Material: The bacterial communities of the six mite populations from different habitats and diets were compared by Sanger sequencing of cloned 16S rRNA obtained from amplification with universal eubacterial primers and using bacterial taxon-specific primers on the samples of adults/juveniles or eggs. Microscopic techniques were used to localize bacteria in food boli and mite bodies. The morphological determination of the mite populations was confirmed by analyses of CO1 and ITS fragment genes.
Results: The following symbiotic bacteria were found in the compared mite populations: Wolbachia (two populations), Cardinium (five populations), Bartonella-like (five populations), Blattabacterium-like symbiont (three populations), and Solitalea-like (six populations). Of 35 identified OTUs97, only Solitalea was identified in all populations. The next most frequent and abundant sequences were Bacillus, Moraxella, Staphylococcus, Kocuria, and Microbacterium. We suggest that some bacterial species may occasionally be ingested with food. Bacteriocytes were observed in some individuals in all mite populations. Bacteria were not visualized in food boli by staining, but bacteria were found by histological means in ovaria of Wolbachia-infested populations.
Conclusion: The presence of Blattabacterium-like, Cardinium, Wolbachia, and Solitalea-like bacteria in the eggs of T. putrescentiae indicates mother-to-offspring (vertical) transmission. Results of this study indicate that diet and habitats influence not only the ingested bacteria but also the symbiotic bacteria of T. putrescentiae.
- …